1.
Hum Genomics ; 18(1): 44, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685113

ABSTRACT

BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria are needed.


Subject(s)
Rare Diseases , Humans , Rare Diseases/genetics , Rare Diseases/diagnosis , Genome, Human/genetics , Genetic Variation/genetics , Computational Biology/methods , Phenotype
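The abstract of entry 1 describes a maximum F-measure computed from precision and recall of causal variants across EPCR values. The following is a minimal sketch of how such a metric could be computed, not the challenge's actual scoring code: the function name `max_f_measure`, the variant identifiers, and the choice to treat each distinct EPCR value as a threshold are all assumptions made for illustration.

```python
from typing import Iterable


def max_f_measure(predictions: Iterable[tuple[str, float]], causal: set[str]) -> float:
    """Maximum F1 over all score thresholds (hypothetical helper).

    predictions: (variant_id, EPCR) pairs submitted by a model.
    causal: identifiers of the known causal variants (must be non-empty).
    """
    if not causal:
        raise ValueError("at least one causal variant is required")

    # Rank predictions from highest to lowest EPCR.
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)
    best = 0.0
    true_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][1]
        # Accept every prediction tied at the current threshold.
        while i < len(ranked) and ranked[i][1] == threshold:
            if ranked[i][0] in causal:
                true_pos += 1
            i += 1
        precision = true_pos / i          # i predictions accepted so far
        recall = true_pos / len(causal)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Made-up example: two causal variants, four ranked predictions.
example = [("v1", 0.9), ("v2", 0.8), ("v3", 0.4), ("v4", 0.1)]
print(max_f_measure(example, {"v1", "v3"}))
```

Sweeping thresholds from the top of the ranking means each prediction is visited once, so precision and recall can be updated incrementally rather than recomputed per threshold.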
2.
Anim Microbiome ; 6(1): 17, 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38555432

ABSTRACT

BACKGROUND: Antimicrobial resistance has been identified as a major threat to global health. The pig food chain is considered an important source of antimicrobial resistance genes (ARGs). However, there is still a lack of knowledge on the dispersion of ARGs in the pig production system, including the external environment. RESULTS: In the present study, we longitudinally followed one swine farm located in Italy from the weaning phase to the slaughterhouse to comprehensively assess the diversity of ARGs, their diffusion, and the bacteria associated with them. We obtained shotgun metagenomic sequences from 294 samples, including pig feces, farm environment, soil around the farm, wastewater, and slaughterhouse environment. We identified a total of 530 species-level genome bins (SGBs), which allowed us to assess the dispersion of microorganisms and their associated ARGs in the farm system. We identified 309 SGBs shared among the animals' gut microbiome and the internal and external farm environments. Specifically, these SGBs were characterized by a diverse and complex resistome, with ARGs active against 18 different classes of antibiotic compounds, closely matching antibiotic use in the pig food chain in Europe. CONCLUSIONS: Collectively, our results highlight the urgency of implementing more effective countermeasures to limit the dispersion of ARGs in pig food systems and the relevance of metagenomics-based approaches to monitor the spread of ARGs for the safety of the farm working environment and the surrounding ecosystems.

3.
Nucleic Acids Res ; 52(D1): D494-D501, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37791887

ABSTRACT

MultifacetedProtDB is a database of multifunctional human proteins deriving information from other databases, including UniProt, GeneCards, Human Protein Atlas (HPA), Human Phenotype Ontology (HPO) and MONDO. Under the label 'multifaceted' it collects multitasking proteins described in the literature as pleiotropic, multidomain, promiscuous (in relation to enzymes catalysing multiple substrates) and moonlighting (with two or more molecular functions), which are difficult to retrieve with a direct search in existing non-specific databases. The study of multifunctional proteins is an expanding research area aiming to elucidate the complexities of biological processes, particularly in humans, where multifunctional proteins play roles in various processes, including signal transduction, metabolism, gene regulation and cellular communication, and are often involved in disease onset and progression. The webserver allows searching by gene, protein and any associated structural and functional information, such as available structures from PDB, structural models and interactors, using multiple filters. Protein entries are supplemented with comprehensive annotations including EC number, GO terms (biological pathways, molecular functions, and cellular components), pathways from Reactome, subcellular localization from UniProt, tissue and cell type expression from HPA, and associated diseases following MONDO, Orphanet and OMIM classification. MultifacetedProtDB is freely available as a web server at: https://multifacetedprotdb.biocomp.unibo.it/.


Subject(s)
Databases, Protein , Proteins , Humans , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Databases as Topic
4.
Sci Total Environ ; 912: 169086, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38056648

ABSTRACT

Poultry farms are hotspots for the development and spread of antibiotic resistance genes (ARGs), due to high stocking densities and extensive use of antibiotics, posing a threat of spread and contagion to workers and the external environment. Here, we applied shotgun metagenome sequencing to characterize the gut microbiome and resistome of poultry, workers and their households - also including microbiomes from the internal and external farm environment - in three different farms in Italy during a complete rearing cycle. Our results highlighted a relevant overlap among the microbiomes of poultry, workers, and their families (gut and skin), with clinically relevant ARGs and associated mobile elements shared in both poultry and human samples. On a finer scale, the reconstruction of species-level genome bins (SGBs) allowed us to delineate the dynamics of microorganism and ARG dispersion from farm systems. We found that associations with worker microbiomes represent the main route of ARG dispersion from poultry to human populations. Collectively, our findings clearly demonstrate the urgent need to implement more effective procedures to counteract ARG dispersion from poultry food systems and the relevance of metagenomics-based metacommunity approaches to monitor the ARG dispersion process for the safety of the working environment on farms.


Subject(s)
Microbiota , Poultry , Animals , Humans , Farms , Anti-Bacterial Agents/pharmacology , Drug Resistance, Microbial/genetics , Genes, Bacterial
5.
Res Sq ; 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37577579

ABSTRACT

In the context of the Critical Assessment of Genome Interpretation, 6th edition (CAGI6), the Genetics of Neurodevelopmental Disorders Lab in Padua proposed a new ID-challenge to foster the development of computational methods for predicting patients' phenotypes and causal variants. Eight research teams, with 30 models in total, had access to the phenotype details and real genetic data, based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by neurodevelopmental disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions with onset in infancy. In this study we evaluate the ability and accuracy of computational methods to predict comorbid phenotypes, based on the clinical features described for each patient, and causal variants. Finally, we asked participants to develop a method to find new possible genetic causes for patients without a genetic diagnosis. As in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (causative, putative pathogenic and contributing factors) were provided. Considering the overall clinical manifestation of our cohort, we provided the variant data and phenotypic traits of the 150 patients from the CAGI5 ID-Challenge for training and validation during the development of prediction methods.

6.
medRxiv ; 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37577678

ABSTRACT

Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families are consented for sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge, placing variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: Predictors were provided a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on rank position of true positive causal variants and maximum F-measure, based on precision and recall of causal variants across EPCR thresholds. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband with phenotype overlap with asparagine synthetase deficiency. Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation.

7.
Front Mol Biosci ; 10: 1169109, 2023.
Article in English | MEDLINE | ID: mdl-37234922

ABSTRACT

Collectively, rare genetic disorders affect a substantial portion of the world's population. In most cases, those affected face difficulties in receiving a clinical diagnosis and genetic characterization. The understanding of the molecular mechanisms of these diseases and the development of therapeutic treatments for patients are also challenging. However, the application of recent advancements in genome sequencing/analysis technologies and computer-aided tools for predicting phenotype-genotype associations can bring significant benefits to this field. In this review, we highlight the most relevant online resources and computational tools for genome interpretation that can enhance the diagnosis, clinical management, and development of treatments for rare disorders. Our focus is on resources for interpreting single nucleotide variants. Additionally, we present use cases for interpreting genetic variants in clinical settings and review the limitations of these results and prediction tools. Finally, we have compiled a curated set of core resources and tools for analyzing rare disease genomes. Such resources and tools can be utilized to develop standardized protocols that will enhance the accuracy and effectiveness of rare disease diagnosis.

9.
Life Sci Space Res (Amst) ; 36: 47-58, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36682829

ABSTRACT

Exposure to ionizing radiation is considered by NASA to be a major health hazard for deep space exploration missions. Ionizing radiation sensitivity is modulated by both genomic and environmental factors. Understanding their contributions is crucial for designing experiments in model organisms, evaluating the risk of deep space (i.e. high-linear energy transfer, or LET, particle) radiation exposure in astronauts, and also selecting therapeutic irradiation regimes for cancer patients. We identified single nucleotide polymorphisms in 15 strains of mice, including 10 collaborative cross model strains and 5 founder strains, associated with spontaneous and ionizing radiation-induced in vitro DNA damage quantified based on immunofluorescent tumor protein p53 binding protein (53BP1) positive nuclear foci. Statistical analysis suggested an association with pathways primarily related to cellular signaling, metabolism, tumorigenesis and nervous system damage. We observed different genomic associations in early (4 and 8 h) responses to different LET radiation, while later (24 hour) DNA damage responses showed a stronger overlap across all LETs. Furthermore, a subset of pathways was associated with spontaneous DNA damage, suggesting 53BP1 positive foci as a potential biomarker for DNA integrity in mouse models. Our results suggest several mouse strains as new models to further study the impact of ionizing radiation and validate the identified genetic loci. We also highlight the importance of future human in vitro studies to refine the association of genes and pathways with the DNA damage response to ionizing radiation and identify targets for space travel countermeasures.


Subject(s)
DNA Damage , Neoplasms , Humans , Mice , Animals , DNA Repair , Radiation, Ionizing , Genomics
10.
Sci Rep ; 12(1): 17963, 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36289281

ABSTRACT

According to databases such as OMIM, Humsavar, ClinVar and Monarch, 1494 human enzymes are presently associated with 2539 genetic diseases, 75% of which are rare (with an Orphanet code). The Mondo ontology initiative standardizes disease names into specific codes, making possible a computational association between genes, variants, diseases, and their effects on biological processes. Here, we tackle the problem of which biological processes enzymes can affect when the protein variant is disease-associated. We adopt Reactome to describe human biological processes, and by mapping disease-associated enzymes in the Reactome pathways, we establish a Reactome-disease association. This allows a novel categorization of human monogenic and polygenic diseases based on Reactome pathways and reactions. Our analysis aims at dissecting the complexity of the human genetic disease universe, highlighting all the possible links between diseases and Reactome pathways. The novel mapping helps in understanding the biochemical/molecular biology of the disease and gives a direct glimpse of the present knowledge of other molecules involved. This is useful for a complete overview of the molecular mechanism(s) of the disease and for planning future investigations. Data are collected in DAR, a database that is freely searchable and available at https://dar.biocomp.unibo.it.


Subject(s)
Biological Phenomena , Humans , Databases, Factual , Computational Biology
11.
Front Mol Biosci ; 9: 966927, 2022.
Article in English | MEDLINE | ID: mdl-36188216

ABSTRACT

Grouping residue variations in a protein according to their physicochemical properties allows a dimensionality reduction of all the possible substitutions in a variant with respect to the wild type. Here, by using a large dataset of proteins with disease-related and benign variations, derived by merging Humsavar and ClinVar data, we investigate to what extent our physicochemical grouping procedure can help in determining whether patterns of variation types are related to specific groups of diseases and whether they occur in Pfam and/or InterPro gene domains. We download 75,145 germline disease-related and benign variations of 3,605 genes, group them according to physicochemical categories and map them into Pfam and InterPro gene domains. Statistically validated analysis indicates that each cluster of genes associated with Mondo anatomical system categorizations is characterized by a specific variation pattern. Patterns identify specific Pfam and InterPro domain-Mondo category associations. Our data suggest that the association of variation patterns with Mondo categories is unique and may help in associating gene variants with genetic diseases. This work corroborates, in a much larger data set, previous observations from our group.
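The dimensionality reduction mentioned in the abstract of entry 11 can be illustrated with a short sketch. The abstract does not list the physicochemical categories used, so the four residue classes below, the `RESIDUE_CLASS` table, and the function names are assumptions chosen only for illustration. With four classes, the 20 × 19 = 380 possible single-residue substitutions collapse into at most 4 × 4 = 16 variation types.

```python
from collections import Counter

# Illustrative residue classes (an assumption, not the scheme used in the paper).
RESIDUE_CLASS = {
    **dict.fromkeys("GAVLIMPC", "apolar"),
    **dict.fromkeys("FWY", "aromatic"),
    **dict.fromkeys("STNQH", "polar"),
    **dict.fromkeys("DEKR", "charged"),
}


def variation_type(wild: str, variant: str) -> str:
    """Map a substitution (one-letter codes) to a class-level variation type."""
    return f"{RESIDUE_CLASS[wild]}->{RESIDUE_CLASS[variant]}"


def variation_pattern(variations: list[str]) -> Counter:
    """Count variation types for substitutions written like 'R175H'.

    The first character is the wild-type residue and the last one is the
    variant residue; the position in between is ignored here.
    """
    return Counter(variation_type(v[0], v[-1]) for v in variations)


# Made-up input: three missense substitutions for a hypothetical gene.
print(variation_pattern(["R175H", "G12V", "R273C"]))
```

Comparing such counts between groups of genes is one simple way to look for group-specific variation patterns; the statistical validation described in the abstract is not reproduced here.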

12.
Front Mol Biosci ; 8: 617016, 2021.
Article in English | MEDLINE | ID: mdl-34026820

ABSTRACT

Human genome resequencing projects provide an unprecedented amount of data about single-nucleotide variations occurring in protein-coding regions and often leading to observable changes in the covalent structure of gene products. For many of these variations, links to Online Mendelian Inheritance in Man (OMIM) genetic diseases are available and are reported in many databases that collect human variation data, such as Humsavar. However, the current knowledge of the molecular mechanisms that lead to disease is, in many cases, still limited. For understanding the complex mechanisms behind disease onset, the identification of putative models, when considering the protein structure and physicochemical features of the variations, can be useful in many contexts, including early diagnosis and prognosis. In this study, we investigate the occurrence and distribution of human disease-related variations in the context of Pfam domains. The aim of this study is the identification and characterization of Pfam domains that are statistically more likely to be associated with disease-related variations. The study takes into consideration 2,513 human protein sequences with 22,763 disease-related variations. We describe patterns of disease-related variation types in one-to-one relation with Pfam domains, which are likely to be possible markers for linking Pfam domains to OMIM diseases. Furthermore, we take advantage of the specific association between disease-related variation types and Pfam domains for clustering diseases according to the Human Disease Ontology, and we establish a relation among variation types, Pfam domains, and disease classes. We find that Pfam models are specific markers of patterns of variation types and that they can serve to bridge genes, diseases, and disease classes. Data are available as Supplementary Material for 1,670 Pfam models, including 22,763 disease-related variations associated with 3,257 OMIM diseases.

13.
Int J Mol Sci ; 22(6)2021 Mar 12.
Article in English | MEDLINE | ID: mdl-33809039

ABSTRACT

Taking advantage of the latest cryogenic electron microscopy structure of human huntingtin, we explored its physicochemical properties with computational methods, focusing on the solvent accessible surface of the protein and highlighting a mix of hydrophobic and hydrophilic patterns, with a prevalence of the latter. We then evaluated the probability of exposed residues being in contact with other proteins, discovering that they tend to cluster in specific regions of the protein. We then found that the remaining portions of the protein surface can contain calcium-binding sites, which we propose here as putative mediators of the protein's interaction with membranes. Our findings are discussed in relation to the present knowledge of huntingtin functional annotation.


Subject(s)
Calcium/metabolism , Computational Biology , Huntingtin Protein/chemistry , Proteins/genetics , Binding Sites/genetics , Humans , Huntingtin Protein/genetics , Huntingtin Protein/ultrastructure , Hydrophobic and Hydrophilic Interactions , Models, Molecular , Protein Binding/genetics , Solvents/chemistry , Surface Properties
14.
Int J Mol Sci ; 23(1)2021 Dec 23.
Article in English | MEDLINE | ID: mdl-35008593

ABSTRACT

MTHFR deficiency still deserves investigation to associate the phenotype with protein structure variations. To this end, considering the MTHFR wild-type protein structure, with a catalytic and a regulatory domain, and taking advantage of state-of-the-art computational tools, we explore the properties of 72 missense variations known to be disease-associated. By computing the thermodynamic ΔΔG change according to a consensus method that we recently introduced, we find that 61% of the disease-related variations destabilize the protein, are present in both the catalytic and regulatory domains, and correspond to known biochemical deficiencies. The propensity of solvent accessible residues to be involved in protein-protein interaction sites indicates that most of the interacting residues are located in the regulatory domain, and that only three of them, located at the interface of the functional protein homodimer, are both disease-related and destabilizing. Finally, we compute the protein architecture with Hidden Markov Models, one from Pfam for the catalytic domain and the second computed in house for the regulatory domain. We show that patterns of disease-associated, physicochemical variation types, in both the catalytic and regulatory domains, are unique for MTHFR deficiency when mapped into the protein architecture.


Subject(s)
Homocystinuria/genetics , Methylenetetrahydrofolate Reductase (NADPH2)/deficiency , Muscle Spasticity/genetics , Catalytic Domain/genetics , Humans , Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Protein Interaction Maps/genetics , Psychotic Disorders/genetics
15.
Biomedicines ; 8(8)2020 Jul 29.
Article in English | MEDLINE | ID: mdl-32751059

ABSTRACT

Enzymes are key proteins performing the basic functional activities in cells. In humans, enzymes can also be responsible for diseases, and the molecular mechanisms underlying the genotype to phenotype relationship are under investigation for diagnosis and medical care. Here, we focus on highlighting enzymes that are active in different metabolic pathways and become relevant hubs in protein interaction networks. We perform a statistical analysis of our present knowledge on human metabolic pathways (the Kyoto Encyclopaedia of Genes and Genomes (KEGG)), and we find that the aldehyde dehydrogenase (NAD(+)) activity, described by Enzyme Commission number EC 1.2.1.3, and the acetyl-CoA C-acetyltransferase activity (EC 2.3.1.9) are the ones most frequently involved. By associating functional activities (EC numbers) with enzyme proteins, we find the proteins most frequently involved in metabolic pathways. Our analysis shows that these proteins are endowed with the highest numbers of interaction partners when compared to all the enzymes in the pathways and with the highest numbers of predicted interaction sites. As specific enzyme protein test cases, we focus on Alpha-Aminoadipic Semialdehyde Dehydrogenase (ALDH7A1, EC 1.2.1.3) and Acetyl-CoA acetyltransferase, cytosolic and mitochondrial (gene products of ACAT2 and ACAT1, respectively; EC 2.3.1.9). With computational approaches we show that it is possible, starting from the enzyme structure, to highlight clues to their multiple roles in different pathways and to putative mechanisms promoting the association of genes with disease.

16.
Hum Mutat ; 40(9): 1546-1556, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31294896

ABSTRACT

Testing for variation in BRCA1 and BRCA2 (commonly referred to as BRCA1/2) has emerged as a standard clinical practice and is helping countless women better understand and manage their heritable risk of breast and ovarian cancer. Yet the increased rate of BRCA1/2 testing has led to an increasing number of Variants of Uncertain Significance (VUS), and the rate of VUS discovery currently outpaces the rate of clinical variant interpretation. Computational prediction is a key component of the variant interpretation pipeline. In the CAGI5 ENIGMA Challenge, six prediction teams submitted predictions on 326 newly interpreted variants from the ENIGMA Consortium. By evaluating these predictions against the new interpretations, we have gained a number of insights into the state of the art of variant prediction and specific steps to further advance it.


Subject(s)
BRCA1 Protein/genetics , BRCA2 Protein/genetics , Breast Neoplasms/diagnosis , Computational Biology/methods , Ovarian Neoplasms/diagnosis , Breast Neoplasms/genetics , Early Detection of Cancer , Female , Genetic Predisposition to Disease , Genetic Testing , Genetic Variation , Humans , Models, Genetic , Ovarian Neoplasms/genetics
17.
BMC Genomics ; 20(Suppl 8): 548, 2019 Jul 16.
Article in English | MEDLINE | ID: mdl-31307376

ABSTRACT

BACKGROUND: Many diseases are associated with complex patterns of symptoms and phenotypic manifestations. Parsimonious explanations aim at reconciling the multiplicity of phenotypic traits with the perturbation of one or a few biological functions. For this, it is necessary to characterize human phenotypes at the molecular and functional levels, by exploiting gene annotations and known relations among genes, diseases and phenotypes. This characterization makes it possible to implement tools for retrieving functions shared among phenotypes co-occurring in the same patient, facilitating the formulation of hypotheses about the molecular causes of the disease. RESULTS: We introduce PhenPath, a new resource consisting of two parts: PhenPathDB and PhenPathTOOL. The former is a database collecting the human genes associated with the phenotypes described in Human Phenotype Ontology (HPO) and OMIM Clinical Synopses. Phenotypes are then associated with biological functions and pathways by means of NET-GE, a network-based method for functional enrichment of sets of genes. The present version considers only phenotypes related to diseases. PhenPathDB collects information for 18 OMIM Clinical Synopses and 7137 HPO phenotypes, related to 4292 diseases and 3446 genes. Enrichment of Gene Ontology annotations endows some 87.7, 86.9 and 73.6% of HPO phenotypes with Biological Process, Molecular Function and Cellular Component terms, respectively. Furthermore, 58.8 and 77.8% of HPO phenotypes are also enriched for KEGG and Reactome pathways, respectively. Based on PhenPathDB, PhenPathTOOL analyzes user-defined sets of phenotypes, retrieving the diseases, genes and functional terms that they share. This information can provide clues for interpreting the co-occurrence of phenotypes in a patient. CONCLUSIONS: The resource allows finding molecular features useful for investigating diseases characterized by multiple phenotypes, and in this way it can help researchers and physicians in identifying molecular mechanisms and biological functions underlying the concomitant manifestation of phenotypes. The resource is freely available at http://phenpath.biocomp.unibo.it.


Subject(s)
Biological Ontologies , Computational Biology/methods , Databases, Genetic , Phenotype , Disease/genetics , Humans
18.
Hum Mutat ; 40(9): 1519-1529, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31342580

ABSTRACT

The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016 invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's correlation coefficients of up to 0.61. We also observed that top methods were significantly more correlated with each other than they were with observed enzymatic activity values, which we believe speaks to the importance of sequence conservation across the different methods. Improved functional predictions on the VUS will help population-scale analysis of disease epidemiology and rare variant association analysis.


Subject(s)
Acetylglucosaminidase/metabolism , Computational Biology/methods , Mutation, Missense , Acetylglucosaminidase/genetics , Humans , Models, Genetic , Regression Analysis
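The abstract of entry 18 evaluates predictors by Pearson's correlation with measured enzymatic activity, and also compares predictors with each other. A minimal sketch of both comparisons follows; the function name `pearson_r` and all numeric values are made up for illustration and are not taken from the challenge data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical relative activities for five variants (fraction of wild type).
observed = [0.05, 0.40, 0.85, 0.10, 0.60]
method_a = [0.20, 0.35, 0.70, 0.30, 0.40]
method_b = [0.25, 0.30, 0.75, 0.35, 0.45]

print("A vs observed:", pearson_r(method_a, observed))
print("B vs observed:", pearson_r(method_b, observed))
print("A vs B:       ", pearson_r(method_a, method_b))
```

Printing the method-to-method coefficient next to the method-to-observation coefficients makes it easy to see when predictors agree with each other more than with the experiment, which is the situation the abstract reports.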
19.
Hum Mutat ; 40(9): 1463-1473, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31283071

ABSTRACT

This paper reports the evaluation of predictions for the "CALM1" challenge in the fifth round of the Critical Assessment of Genome Interpretation held in 2018. In the challenge, the participants were asked to predict effects on yeast growth caused by missense variants of human calmodulin, a highly conserved protein in eukaryotic cells sensing calcium concentration. The performance of predictors implementing different algorithms and methods is similar. Most predictors are able to identify the deleterious or tolerated variants with modest accuracy, with a baseline predictor based purely on sequence conservation slightly outperforming the submitted predictions. Nevertheless, we think that the accuracy of predictions remains far from satisfactory, and the field awaits substantial improvements. The most poorly predicted variants in this round surround functional CALM1 sites that bind calcium or peptide, which suggests that better incorporation of structural analysis may help improve predictions.


Subject(s)
Calmodulin/chemistry , Calmodulin/genetics , Computational Biology/methods , Mutation, Missense , Yeasts/growth & development , Algorithms , Binding Sites , Calcium/metabolism , Calmodulin/metabolism , Evolution, Molecular , Fungal Proteins/chemistry , Fungal Proteins/genetics , Fungal Proteins/metabolism , Genetic Fitness , Humans , Models, Genetic , Models, Molecular , Protein Conformation , Protein Engineering , Yeasts/genetics
20.
Hum Mutat ; 40(9): 1373-1391, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31322791

ABSTRACT

Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. The performance of matching genomes to categories was no better than random but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes.


Subject(s)
Computational Biology/methods , Genetic Variation , Undiagnosed Diseases/diagnosis , Adolescent , Child , Child, Preschool , Computer Simulation , Databases, Genetic , Female , Genetic Predisposition to Disease , Humans , Male , Phenotype , Undiagnosed Diseases/genetics , Whole Genome Sequencing